Large Language Models are Zero-Shot Reasoners
Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and are generally known as excellent few-shot learners with task-specific exemplars. Notably, chain-of-thought (CoT) prompting, a recent technique for eliciting complex multi-step reasoning through step-by-step answer examples, achieved state-of-the-art performance in arithmetic and symbolic reasoning, difficult system-2 tasks that do not follow the standard scaling laws for LLMs. While these successes are often attributed to LLMs' ability for few-shot learning, we show that LLMs are decent zero-shot reasoners by simply adding "Let's think step by step" before each answer. Experimental results demonstrate that our Zero-shot-CoT, using the same single prompt template, significantly outperforms zero-shot LLM performance on diverse benchmark reasoning tasks including arithmetic (MultiArith, GSM8K, AQUA-RAT, SVAMP), symbolic reasoning (Last Letter, Coin Flip), and other logical reasoning tasks (Date Understanding, Tracking Shuffled Objects), without any hand-crafted few-shot examples. The versatility of this single prompt across very diverse reasoning tasks hints at untapped and understudied fundamental zero-shot capabilities of LLMs, suggesting that high-level, multi-task, broad cognitive capabilities may be extracted by simple prompting.
Tokyo U & Google Brain Train Large Language Models as Zero-Shot Reasoners
Pretrained large language models (LLMs) are now scaled to more than 100B parameters and have revolutionized the field of natural language processing (NLP) with their excellent few-shot and zero-shot learning capabilities. However, although state-of-the-art LLMs make short work of system-1 tasks, they still struggle on system-2 tasks that require slow, multi-step reasoning.

A research team from the University of Tokyo and Google Brain addresses this deficiency in their new paper Large Language Models are Zero-Shot Reasoners, which demonstrates that LLMs can become decent zero-shot reasoners through the addition of a simple prompt -- "Let's think step by step" -- that elicits a step-by-step reasoning process before each question is answered. Their resulting Zero-shot-CoT (zero-shot chain-of-thought prompting) method achieves huge performance gains compared to the zero-shot baseline.

The division of human thinking into fast/automatic (system-1) and slow/rational (system-2) processes was popularized in the 2011 bestseller Thinking, Fast and Slow by psychologist Daniel Kahneman and has been widely adopted by machine learning researchers seeking to endow their models with more advanced and humanlike reasoning capabilities.
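The prompting scheme described above can be sketched as a two-stage pipeline: first append the trigger phrase "Let's think step by step" to elicit a rationale, then feed that rationale back with an answer-extraction cue. The sketch below illustrates the idea only; the `complete` function is a hypothetical stub standing in for a real LLM completion call, and its canned responses are invented for demonstration, not actual model output.

```python
def complete(prompt: str) -> str:
    """Hypothetical LLM completion stub; a real system would call an LLM API here.
    The canned strings below are illustrative only."""
    if "Therefore, the answer is" in prompt:
        return " 12"  # stage-2 answer extraction
    return " There are 3 bags with 4 apples each, so 3 * 4 = 12 apples."


def zero_shot_cot(question: str) -> str:
    # Stage 1: reasoning extraction -- append the trigger phrase so the
    # model produces a step-by-step rationale instead of a bare answer.
    reasoning_prompt = f"Q: {question}\nA: Let's think step by step."
    rationale = complete(reasoning_prompt)

    # Stage 2: answer extraction -- feed the rationale back with a cue
    # that asks for the final answer in an easily parseable form.
    answer_prompt = f"{reasoning_prompt}{rationale}\nTherefore, the answer is"
    return complete(answer_prompt).strip()


print(zero_shot_cot("There are 3 bags with 4 apples each. How many apples?"))
# prints "12" with the stub above
```

Note that, unlike few-shot CoT prompting, no hand-crafted worked examples appear anywhere in either prompt; the same single template is reused across tasks.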